###library(“dplyr”) ###library(“stringr”) ###library(“ggplot2”)

Introduction

The rise of digital technology as changed the way we consume media, and the publishing industry is no exception. With growing trends of ebooks, many companies have started publishing books online for readers to download and access. Additionally, there are online companies such as Audible that produces spoken audio content. To say that the medium in which we consume media is an understatement.

In this report, we will look at recent shifts and patterns from video discs, sound discs, physical books to ebooks. This report aims to provide a comprehensive overview of the current state of the book market and the growing trend towards ebooks.

Summary Information

#####my_dataSet <-read.csv(“~/Desktop/2022-2023-All-Checkouts-SPL-Data.csv”, stringsAsFactors = FALSE)

Were people using a digital platform or physical platform for books?

total_digital_usage <- my_dataSet %>% filter(UsageClass == “Digital”)

total_physical_usage <- my_dataSet %>% filter(UsageClass == “Physical”)

How many total checkouts were there total from 2022-2023?

total_checkouts <- my_dataSet %>% summarize(total_all = sum(Checkouts))

How many ebook checkouts were there from 2022-2023?

total_ebook_checkouts <- my_dataSet %>% filter(MaterialType == “EBOOK”) %>% summarize(total_ebook = sum(Checkouts))

answer1 <- total_ebook_checkouts/ total_checkouts return(answer1)

How many physical book checkouts were there from 2022-2023?

total_book_checkouts <- my_dataSet %>% filter(MaterialType == “BOOK”) %>% summarize(total_book = sum(Checkouts))

answer2 <- total_book_checkouts/ total_checkouts return(answer2)

How many audio book checkouts were there from 2022-2023?

total_audiobook_checkouts <- my_dataSet %>% filter(MaterialType == “AUDIOBOOK”) %>% summarize(total_audiobook = sum(Checkouts))

answer3 <- total_audiobook_checkouts/ total_checkouts return(answer3)

How many sound disc checkouts were there from 2022-2023?

total_sounddisc_checkouts <- my_dataSet %>% filter(MaterialType == “SOUNDDISC”) %>% summarize(total_sounddisc = sum(Checkouts))

answer4 <- total_sounddisc_checkouts/ total_checkouts return(answer4)

How many video disc checkouts were there from 2022-2023?

total_videodisc_checkouts <- my_dataSet %>% filter(MaterialType == “VIDEODISC”) %>% summarize(total_videodisc = sum(Checkouts))

answer5 <- total_videodisc_checkouts/ total_checkouts return(answer5)

With our findings, it will help us draw closer to a conclusion. The first method finds out whether the item being checked out was either physical or digital (falls into one or the other). With this we can see that for digital items there was 1,331,024 and for physical it was 1,425,990. My next method obtains the total number of checkouts from all the various types of forms. We know from this method that there were 9,196,991 checkouts from all of 2022 through the first month of 2023 in the Seattle area. Now this number allows us to compare the percentage in which each checkout was from. For video discs, there was a total of 623,259 checkouts which in comparison to total checkouts, video discs makes up 6.78% of total checkout from the data. For sound discs, there was a total of 202,984 which makes up only 2.20% of total checkouts. Next, audio books took 23.40% of all total checkouts. For physical books, there was a total of 3,147,727 checkouts which acquaints for 34.23% of all checkouts. Lastly, with ebooks, there were 2,989,063 checkouts making up 32.50%. These numbers help us make note of which form is most popular amongst the Seattle demographic. It also gives us a glimpse of potential trends and shifts in reading.

The Dataset

  • Who collected/published the data?
    This data is published from the Seattle Public Library.

  • What are the parameters of the data (dates, number of checkouts, kinds of books, etc.)?
    The parameters of this data includes usage class, checkout type, material type, checkout year, checkout month, checkouts, title, ISBN, creator, subjects, publisher, and publication year.

  • How was the data collected or generated?
    The data was collected through various ways. An example is through book checkouts in Seattle libraries.

  • Why was the data collected?
    As stated on the SPL website, the data was collected to “build our collections to reflect current trends and changing community needs…”

  • What, if any, ethical questions do you need to consider when working with this data?
    As with most data, there is a privacy concern. Is there a violation of privacy if whatever book we are checking out is kept in database. Is this data being used for what it was intended to be used for?

  • What are possible limitations or problems with this data? (at least 200 words). Like mentioned above, there could be potential problems with privacy concerns. Some people might not want to have others know what books they are reading or what books they like to read. Additionally, there is limitation as this data set is just for the Seattle area. Another data set in New York or Florida could be completely different. This makes the data set not representative of the United States but just solely for Seattle. Another problem with this data set is that even though it accounts for checkouts, doesn’t necessarily mean that people are reading the full book, some may read a chapter and dislike it then return it.

ncol(my_dataSet) 12 columns nrow(my_dataSet) 2,757,014 rows

Visual 3

With my last chart, I wanted to compare the total checkouts by all other forms. In this we can see that sound disc and video disc are slowly starting to fall off the radar. It appears that ebooks, books, and audio books, are the most common medium in which readers read with. Years ago, this bar graph would most likely look a lot more different than how it is now.